Shallow Parsing and Text Chunking: a View on Underspecification in Syntax
Abstract
This paper illustrates a shallow parsing technique called “text chunking”, in which “parse incompleteness” is reinterpreted as “parse underspecification”. A text is chunked into structured units that can be identified with certainty on the basis of available knowledge. The chunking process stops at the level of granularity beyond which the analysis becomes undecidable. We argue that a chunked syntactic representation can usefully be exploited as such for non-trivial NLP applications that do not require full text understanding, such as automatic lexical acquisition and information retrieval.
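The idea of stopping where the analysis becomes undecidable can be illustrated with a minimal sketch. It is not the paper's system: the tag classes, the `tag_class` helper and the `chunk` function are hypothetical, and the input is assumed to be already part-of-speech tagged with Penn-Treebank-style tags. The sketch groups adjacent tokens into flat nominal (N), verbal (V) and prepositional (P) units and deliberately does not decide what a prepositional phrase attaches to.

```python
from typing import List, Tuple

Token = Tuple[str, str]        # (word, part-of-speech tag)
Chunk = Tuple[str, List[str]]  # (chunk label, words in the chunk)

# Hypothetical coarse tag classes, chosen for illustration only.
NOMINAL = {"DT", "JJ", "NN", "NNS", "NNP"}
VERBAL = {"MD", "VB", "VBD", "VBZ", "VBP", "VBN", "VBG"}


def tag_class(tag: str) -> str:
    """Map a fine-grained tag to a coarse chunk label."""
    if tag in NOMINAL:
        return "N"
    if tag in VERBAL:
        return "V"
    if tag == "IN":
        return "P"
    return "O"


def chunk(tokens: List[Token]) -> List[Chunk]:
    """Group adjacent tokens into flat chunks without attaching them."""
    chunks: List[Chunk] = []
    for word, tag in tokens:
        label = tag_class(tag)
        extends_previous = (
            bool(chunks)
            and label in {"N", "V"}
            and chunks[-1][0] == label
            and tag != "DT"  # a determiner always opens a new nominal chunk
        )
        if extends_previous:
            chunks[-1][1].append(word)
        else:
            chunks.append((label, [word]))
    return chunks


sentence = [
    ("The", "DT"), ("man", "NN"), ("saw", "VBD"),
    ("the", "DT"), ("girl", "NN"), ("with", "IN"),
    ("the", "DT"), ("telescope", "NN"),
]
result = chunk(sentence)
for label, words in result:
    print(label, " ".join(words))
```

In this example the output is a flat sequence of five chunks. Whether "with the telescope" modifies "saw" or "the girl" is left open, which is the sense in which the representation is underspecified rather than incomplete.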
Similar resources
Three Types of Chunking in Korean and Dependency Analysis Based on Lexical Association
Curtailing disambiguation decisions is crucial for efficient and precise sentence analysis when parsing is viewed as making a sequence of disambiguation decisions. In this paper we propose three types of chunking in Korean for the purpose of reducing the search space. We present a parsing method based on chunking and the association among chunks and words in a chunk. Test was conducted on 237...
Text chunking for prosodic phrasing in French
In this paper, we describe experiments in text chunking for prosodic phrasing and generation in French. We present a quick, robust and deterministic parser which uses part-of-speech information and a set of rules to consistently assign prosodic boundaries in Text-To-Speech synthesis. The syntactic phrasing, consisting of segmenting sentences into non-recursive sequences, is defined in terms of se...
Graph- and surface-level sentence chunking
The computing cost of many NLP tasks increases faster than linearly with the length of the representation of a sentence. For parsing, the representation is tokens, while for operations on syntax and semantics it will be more complex. In this paper we propose a new task of sentence chunking: splitting sentence representations into coherent substructures. Its aim is to make further processing of l...
Introduction to the CoNLL-2000 Shared Task Chunking
We describe the CoNLL-2000 shared task: dividing text into syntactically related nonoverlapping groups of words, so-called text chunking. We give background information on the data sets, present a general overview of the systems that have taken part in the shared task and briefly discuss their performance.
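Non-overlapping chunks of this kind are commonly encoded with one tag per token, such as B-NP (first word of a noun chunk), I-NP (continuation) and O (outside any chunk). The sketch below shows how such a tag sequence could be decoded back into word groups. The `iob_to_chunks` helper and the example sentence are hypothetical illustrations, not part of the shared task software or data.

```python
from typing import List, Optional, Tuple


def iob_to_chunks(words: List[str], tags: List[str]) -> List[Tuple[str, List[str]]]:
    """Decode per-token B-/I-/O tags into (label, words) chunks."""
    chunks: List[Tuple[str, List[str]]] = []
    label: Optional[str] = None
    current: List[str] = []

    def flush() -> None:
        # Store the chunk being built, if there is one.
        if label is not None and current:
            chunks.append((label, list(current)))

    for word, tag in zip(words, tags):
        if tag == "O":
            flush()
            label, current = None, []
        elif tag.startswith("B-") or tag[2:] != label:
            # A B- tag, or an I- tag with a new label, opens a chunk.
            flush()
            label, current = tag[2:], [word]
        else:
            # An I- tag with the same label continues the chunk.
            current.append(word)
    flush()
    return chunks


words = ["He", "reckons", "the", "deficit", "."]
tags = ["B-NP", "B-VP", "B-NP", "I-NP", "O"]
decoded = iob_to_chunks(words, tags)
print(decoded)
```

Because every token carries exactly one tag, the decoded groups cannot overlap or nest, which matches the non-overlapping definition of the task given above.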
Publication date: 2002